Ngai Wong

OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond

May 19, 2026, via arXiv

AIS: Adaptive Importance Sampling for Quantized RL

May 13, 2026, via arXiv

ROMER: Expert Replacement and Router Calibration for Robust MoE LLMs on Analog Compute-in-Memory Systems

May 12, 2026, via arXiv

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Apr 11, 2026, via arXiv

CodeComp: Structural KV Cache Compression for Agentic Coding

Apr 11, 2026, via arXiv

MathGen: Revealing the Illusion of Mathematical Competence through Text-to-Image Generation

Mar 31, 2026, via arXiv

Model Evolution Under Zeroth-Order Optimization: A Neural Tangent Kernel Perspective

Mar 22, 2026, via arXiv

Beyond Outliers: A Data-Free Layer-wise Mixed-Precision Quantization Approach Driven by Numerical and Structural Dual-Sensitivity

Mar 18, 2026, via arXiv

Can We Trust LLMs on Memristors? Diving into Reasoning Ability under Non-Ideality

Mar 14, 2026, via arXiv

Efficient Generative Modeling with Unitary Matrix Product States Using Riemannian Optimization

Mar 12, 2026, via arXiv